Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks

Abstract

The robust and efficient recognition of visual relations in images is a hallmark of biological vision. Here, we argue that, despite recent progress in visual recognition, modern machine vision algorithms are severely limited in their ability to learn visual relations. Through controlled experiments, we demonstrate that visual-relation problems strain convolutional neural networks (CNNs). The networks eventually break altogether when rote memorization becomes impossible, such as when the intra-class variability exceeds their capacity. We further show that another type of feedforward network, called a relational network (RN), which was shown to successfully solve seemingly difficult visual question answering (VQA) problems on the CLEVR datasets, suffers from similar limitations. Motivated by the comparable success of biological vision, we argue that feedback mechanisms, including working memory and attention, are the key computational components underlying abstract visual reasoning.


Similar resources


Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks

The robust and efficient recognition of visual relations in images is a hallmark of biological vision. Here, we argue that, despite recent progress in visual recognition, modern machine vision algorithms are severely limited in their ability to learn visual relations. Through controlled experiments, we demonstrate that visual-relation problems strain convolutional neural networks (CNNs). The ne...


Benchmark Visual Question Answer Models by using Focus Map

"Inferring and Executing Programs for Visual Reasoning" proposes a model for visual reasoning that consists of a program generator and an execution engine to avoid end-to-end models. To show that the model actually learns which objects to focus on to answer the questions, the authors give visualizations of the norm of the gradient of the sum of the predicted answer scores with respect to the fina...


Analysis of Mechanical and Thermal Stresses in the Spindle of Lathe Machines

Dimensional accuracy in machined parts depends on the precision of the spindle, which is itself highly affected by applied forces. Spindle precision becomes a more serious concern when the spindle is used over a long period of time. Therefore, stress and strain analysis of the spindle is very important for understanding its behavior and preserving its precision. In this paper, the forces applied to the spindle of a turnin...


Strain Hardening Analysis for M-P Interaction in Metallic Beam of T-Section

This paper derives kinematically admissible bending moment – axial force (M-P) interaction relations for mild steel by considering strain hardening idealisations. Two models for strain hardening, linear and parabolic, have been considered, the parabolic model being closer to the experiments. The interaction relations can predict strains, which is not possible in a rigid, perfectly plastic idealizat...


Publication date: 2017